perm filename SUM[P,JRA]1 blob sn#494091 filedate 1980-01-12 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00008 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	.cent(Section I. Introduction)
C00014 00003	.cent(Section II. Specification Language)
C00021 00004	.P2:
C00023 00005	.cent(Section III. Mapping to Intermediate Language)
C00027 00006	.cent(Section IV. Correctness of the Mapping to Intermediate Form)
C00033 00007	.cent(Section V. Conclusions and Further Research)
C00043 00008	.cent(References)
C00056 ENDMK
C⊗;
.cent(Section I. Introduction)
Software is expensive and, too often, bug ridden.  Debugging is still the most
commonly used method of increasing confidence in the correctness of a program.
Proving the correctness of programs is difficult.
Many approaches have been proposed
to ensure reliable programs, some dealing with the  programming task and others
with proving a completed program meets its specifications.

In some approaches a program's "specifications" are derived from the completed program.
However, we strongly believe that
specifications should be written first; then a program
may be constructed to fit its specifications just as a house is constructed 
according to its blueprints. In this way a proof that a program meets 
certain specifications can
 be developed along with the program itself, or guaranteed through the
use of rules that have already been proven valid to transform the specification
into a program.

The specification should be written in a high level language
that enables one to describe what is to be accomplished, indicating functional
relationships, without considering computational details.  Such a specification,
in which only the abstract description of the program is required, is less prone
to error than a typical programming language specification in which every detail of
the computation must be provided.  

Program synthesis is the generation of a computational specification of a problem or
task from a descriptive specification.  A common criticism of this approach is
that it lacks the redundancy 
of two specifications, one descriptive and one computational, both written
by the programmer; however, in practice, one is derived from the
other, and we feel that the descriptive specification, being free of 
computational details and thus less prone to error, is the appropriate place to
start.  Several approaches to program synthesis
are currently being investigated.

With the deductive approach
(7, 8, 12, 14, 15, 19, 34, 35),
the computer is given specific rules by which it can rewrite statements
syntactically while maintaining equivalent meaning.
This kind of program synthesis is closely related to program transformation
(4, 9, 10, 16, 22, 33). The intent of
a program transformation system is to make the given program more efficient
while preserving its meaning. 

Less formal techniques have also found favor.  Natural language dialogues have
been used to describe a problem to the computer (3, 23, 27, 32, 40).
The user of this kind of 
system specifies the task at a general level, filling in the details as requested
by the computer.
Still other approaches employ specification by example.
Some systems synthesize
programs from sample input-output pairs (26, 39),
others provide sample execution traces (6) or even analogous programs (41).
Several of the approaches mentioned above are combined in the
PSI program synthesis system at Stanford University (25).

Each of the techniques mentioned above has been investigated as a means to
synthesize programs in a specific target language.
It is our thesis that the generation of an algorithm is a
target-language-independent process, and a system that generates programs from
their specifications is a valid and viable way to obtain correct programs.

We have implemented a system to generate programs from logic specifications.
The specifications are first translated into an intermediate language
and then a program is generated from the intermediate form. 
Validity of the system was established by proving that the 
top level mappings from specification to
intermediate language to target program (the proof is for the mapping to LISP)
are correct. Several programs of various sizes have been generated.

 We drop the word
"synthesis", and use program "generation", instead to avoid any misunderstanding
of the claims being made.  
The specification
language for this system includes a subset of first-order Predicate
Calculus known as Horn Clauses.
The specifications we require are "descriptive" in that
they specify the logic of the program,
 but they are
also, in part, "computational" in that the Horn clauses could be "run" as programs
given a complete theorem prover, or logic interpreter (2, 20, 30).

In section II we describe the specification language.  A formal description of the 
language via a context-free grammar for the syntax and an axiomatic description
of the semantics can be found elsewhere (18).
In section III we see how a specification is mapped into a deterministic algorithm
in the intermediate language.  Again, a formal treatment of the language and the
mapping may be found in (18).

In section IV we outline the proof that the algorithm generated satisfies 
the specifications.  Section V contains conclusions that can be drawn from this
effort, as well as directions for continuing research.


.cent(Section II. Specification Language)
.turn on "↑_↓&#"
.turn off "¬"

 

The input to the system is a sequence of definitions.  A target definition
indicates the target language to be used.
  The other kinds of definitions are function, type and generic, and are described
in the following.

A function definition has six parts: a name, an input pattern, a formal parameter 
list, a precondition, a postcondition, and a body. 

.begin nofill
.group
.fp
For example:

			%3function Fact
			%2input pattern? %3(1 0)%2
			parameter list? %3(x y)%2
			precond? %3Integer(x, true) ∧ ≥(x, 0, true).%2
			postcond? %3Integer(y, true) ∧ >(y, 0, true).%2
			body? %3Fact(0, 1)
			       Fact(z, w) ← Sub1(z, z1), Fact(z1, w1), *(z, w1, w).%1

.apart
.end
.fp
where only the italicized information is typed by the user.
An input pattern is a list of 1's and 0's
indicating which of the arguments of the function being defined are expected
to hold input values and which are used to carry values produced during the
computation of the function, respectively.

A formal parameter list is required to match up the appropriate arguments
for the precondition and postcondition specifications.

The precondition is a well-formed-formula of predicate calculus,
 defined on the input variables, that is true only
if the function is defined for those input variables.
By precisely specifying  the domain of the function we can guarantee
termination of the program.
Similarly, the postcondition is a predicate 
formula on the output variables
that specifies the range of the output.
Taken together, the precondition and postcondition
specify the functionality, i.e. the domain and range, of the program being defined.
The range given may in fact be larger than the actual range of the function,
but the domain given must not include extraneous elements.

A body definition is a set of Horn clauses (30) that describes the function, usually
recursively, in an axiomatic way.  Each Horn clause is an implication, stating
that a goal (the consequent) can be asserted if the conjunction of a set of 
subgoals (the antecedent) is satisfied.  In writing
a body definition we are asserting that each of these implications is true;
that is, each may be considered an axiom and be used in deriving theorems.

If the antecedent of an implication is empty, the clause is considered an assertion
and we do not write the arrow, as in %3Fact(0, 1)%1 (read "Fact of 0 is 1").
Kowalski (30) describes the interpretation of Horn
clauses as a programming language.

A type definition is a definition of a function that recognizes
occurrences of the type. 
A generic function is one in which the input pattern is allowed to vary.
For example, one might wish to define a function "%3Times(x, y, z)%1", meaning
" %3x%1 times %3y%1 is
%3z%1".  If defined as a generic
function we can use %3Times%1 in defining other functions whenever we
have two of its arguments available and wish to compute the third.

A generic definition
includes its name, a formal parameter
list, a
list of choices specifying how the function may be used, and one or
more body definitions.  With Horn Clauses, it is often the case that one
description of the body can be used for several calling styles (37).
A choice consists of an input pattern, a name to be associated with this
particular style of call on the function, a precondition, a postcondition, and a
body name indicating the function body to be used.  
A body definition
associates a body name with a definition (i.e. a set of Horn clauses).
.P2:

%1The semantics of a function specification,
informally, is that for all
possible instantiations of the formal parameter list, if each parameter
designated as an input by the input pattern has a value bound to it and
the value of the precondition evaluated on the inputs is true (i.e., the 
inputs lie within the domain of definition of the function), then the
conjunction  of the Horn clauses and the postcondition (applied to any tuple
in the relation) are true.
A formal statement is given in (18).
.begin "form" nofill
.fp
Our %3Fact%1 example has the semantics of the associated first order logic expression:

%2∀x,y [ defined[(x)] ∧ Integer(x, true) ∧ ≥(x, 0, true)   →
	Fact(0, 1)
	  ∧  ∀z,z1,w,w1 [ [Sub1(z, z1) ∧ Fact(z1, w1) ∧ *(z, w1, w)  →  Fact(z, w)] ]
	       ∧ [ Fact(x, y)	→ Integer(y, true) ∧ >(y, 0, true) ]  ]

.end "form"
%1
The generic specification has the effect of defining several different functions
and associating them all with a "generic" name that can be used when specifying
other functions rather than
using a different name every time the function is used in
a different way, that is, with a different input-pattern.

.cent(Section III. Mapping to Intermediate Language)

The intermediate language form of function definitions is only used internal
to the system.
The auxiliary specifications (all but body specifications) are essentially
unchanged; their representation is slightly different but exactly the same
information is expressed.  Thus, the interesting part of the 
intermediate form is its treatment of the body specification.

The body of a function in internal form is a "backtracking-conditional" ("bktrkcond"
for short).
Each clause of the body specification becomes an "alternative" in the bktrkcond.  An
alternative consists of a "match"-part, that is
an argument list to be matched against the actual
parameters of a function call, and a "try" part consisting of the subgoals
to be accomplished.
The backtracking involved is well-behaved since we may backtrack only
over entire alternatives, never over individual subgoals in the try-part of an
alternative.  All that may have to be undone are the bindings made in accomplishing
the unification of the actual parameter list with the argument list in the 
match-part of the alternative.
We impose determinism on the program at this stage by insisting on a particular
order for considering the alternatives.

The intermediate form of a generic definition contains the generic name,
the formal parameter list, and a list associating input-patterns
with the function name to be used when that input-pattern is recognized. 
A generic specification
causes the function definitions for each alternative version of the function
to be made, as if they had each been specified separately.

The specification of the factorial function
has the following form in the Intermediate Language:⊗↓The "!" is being used as
a prefix to designate variables for the purpose of renaming. ←
.begin nofill;select 3

.group
		%3(function Fact (1 0) (!x !y) 
			(∧ (Integer !x true) (≥ !x 0 true))
			(∧ (Integer !y true) (> !y 0 true))
			(bktrkcond ((0 1) (try))
				   ((!z !w) (try (Sub1 !z !z1) 
						 (Fact !z1 !w1) 
						 (* !z !w1 !w)  ) )
			  ) )
.apart

.end
.cent(Section IV. Correctness of the Mapping to Intermediate Form)

In (18), we  show that the semantics of an input specification is "sufficiently"
preserved in the mapping to intermediate form.  By "sufficiently" we mean that
the semantics of the intermediate form of a function definition is implied by the
semantics of the input specification, and furthermore, if every  function
(or type) %3P%1 is defined such that for each input tuple, %g¬%3x%4i%1,
(i.e., tuple with which the function is called) there exists a unique 
output tuple, %g¬%3x%4o%1,
such that %3P(%g¬%3x%4o%3)%1 is derivable using the input semantics, 
then %3P(%g¬%3x%4o%3)%1 is derivable using the semantics of the intermediate
form of  the definition.  We refer to this restriction of requiring uniqueness
of the output tuple by saying that we are defining only "functional" relationships.

The semantics of input and internal form are not equivalent because the input
semantics may be used several ways to generate proofs.  This is due to the 
nondeterministic nature of the process of finding proofs.  At any stage of a 
proof, there may
be many Horn clauses that are applicable as axioms, and different proof paths
may generate different results.  However, we have restricted ourselves
to defining functional relationships rather than the more general "relations"
which may be one-to-many mappings.  Thus, no matter what direction the proof
takes, there is a unique result. 

  In the intermediate form we are forced to 
attempt the alternatives in a specific order.
Due to the determinism introduced we will follow precisely one proof, the same proof,
every time a procedure is called with the same input values.  The important 
point to establish is that whenever a unique answer may be found in the 
nondeterministic proof process, our deterministic search will find it as well.

  The precondition "guards" execution of a program, ensuring that the program
will terminate.  Thus we need not worry about forever following a "bad"
computation path.  Since there is a unique answer-substitution derivable
using the specification, we don't care that the computation paths may be
different, we need only find one terminating path.
.cent(Section V. Conclusions and Further Research)

The system described provides a valid way
of generating correct programs.
The system is "reasonably" target-language-independent, qualified
only because it is easier to translate to some languages than
to others.  For ease of translation, the target language should allow recursion.
The "back-end" necessary to translate to a non-recursive language, although
certainly feasible, would be more complicated than that for a language
allowing recursion.
We feel this is almost no restriction
at all since we believe that recursion is important enough to be a minimum 
requirement for any language to be considered "reasonable".

The type structure of a target language is an important factor in determining
the ease of adding the language to the system.  The simplest languages to deal with
are those that are type-free, allowing type specifications to be translated
into functions that are recognizers for the type being defined. In a typed
language, one often feels the need (or desire) for polymorphic functions as
allowed by LCF (24).
A simple
example of a function that is by nature polymorphic is a symbol table
lookup.  The type of the result of a lookup should agree with the type of
the variable being looked up.  However, there may be several
different types of variables and values in the table at any time.
The strong typing of Pascal complicated its implementation for this system. In
(18) we discuss the implementations of LISP and Pascal, and describe
how to add a new target language to the system.

The use of generic functions in the specifications
is a feature making it possible to synthesize
 several programs from a single specification.
Generic functions also provide a convenient tool in defining other functions.
The portability of the system is evidenced by its ability to generate a
version of itself.
This ability is also an
indication that the system can handle large "practical" problems.  Several sample
programs have been generated and are listed in (18).

Extensions to the system and subjects of further research have
suggested themselves along the way.  
We would like to add a front-end to the system that would allow more natural input.
This includes several extensions of the syntax of the specification language.
We can extend the syntax of Horn clauses to allow mixed conjunctions
and disjunctions of positive literals on the right
hand side of the arrow. An If-Then-Else might also be allowed.

It would be nice to allow full use of Predicate Calculus in the
specification of a function.  This is a much more difficult problem.
Negation, in particular, is difficult to handle without making assumptions
decidability and completeness.  Synthesis
of programs from general descriptions is being studied by several researchers
(5, 11, 12, 13, 14, 15, 19, 34, 35).
Their results may be incorporated in this system in the future.

We would like to include an interactive environment that helps the user
derive the specifications for a program and prove the correctness of the 
specifications. The system could check the completeness of the specification
by ensuring that at least one Horn clause is applicable to every element
of the domain of the function as specified by the precondition.  This is
difficult in general but for inductively defined domains it may simply be a
reminder to the programmer that they include basis and constructed elements.
To prove the correctness of a specification one must: 1)Prove each
Horn clause is a theorem in the problem domain; 2) Prove that the precondition
guarantees termination of the program;
 and 3) Prove that successful termination of the program always results
in an answer satisfying the postcondition, and that the answer produced is 
unique.  
It is of course too much to expect that the system be capable of 
doing all of this on its own, but a semi-automatic verifier or proof checker
would be useful.

We would like to include a
way of indicating that selective subgoal backtracking is desired.  This
would give us the ability to make statements about the existence of an answer
that satisfies several subgoals simultaneously.  Each subgoal may be satisfied
by a set of answers; we want an element in the intersection of these sets.

The most dramatic improvement to the system would be the development of a 
program that could automatically generate the mapping from intermediate to
target language from a formal specification of the syntax and semantics
of the target language.  

The system described is unique in its ability to generate programs in more
than one language. Moreover, it represents a large, non-trivial application 
of formal specification techniques, generating a version of itself from a
formal specification.
.cent(References)
.begin "bib" indent 0,16;single space;turn on"\#";tabs 12
.AT NULL ⊂IF FILLING THEN BREAK ELSE SKIP 1⊃;
.fp
[1]\#Allen, J. R., %3Anatomy of LISP%1. McGraw-Hill Publishing Co.,
New York, New York, 1978.
.pt24
.fp
[2]\#Andreka, H. and I. Nemeti, "The Generalised Completeness of Horn 
Predicate-Logic as a Programming Language", D.A.I. Research
Report No. 21, Department of Artificial Intelligence, University
of Edinburgh, March, 1976.
.PT24
[3]\#Balzer, R. M., N. M. Goldman, and D.S. Wile, "Informality in
Program Specifications", University of Southern California
Information Sciences Institute, ISI/RR-77-59, April, 1977.
.pt24


		


[4]\#Bauer, F. L., H. Partsch, P. Pepper, and H. W%9:%1ossner, "Notes on the Project
CIP: Outline of a Transformation System", TUM-INFO-7729,
Institut Fur Informatik, Technische Universit%9:%1at M%9:%1unchen,
July, 1977.
.pt24

[5]\#Beach, B., "Transforming Predicate Calculus Statements Into Horn Clauses",
Senior Thesis, University of California, Santa Cruz, June, 1979.
.pt24


[6]\#Biermann, A. W., and R. Krishnaswamy, "Constructing Programs From
Example Computations", %3IEEE Transactions on Software Engineering%1,
vol. 2, no. 3, Sept., 1976.
.pt24


[7]\#Buchanan, J. R., "A Study in Automatic Programming", AIM-245, STAN-CS-74-458,
Ph.D. Thesis, Stanford University, May, 1974.
.pt24

[8]\#Buchanan, J. R., and D. C. Luckham, "On Automating the Construction of Programs",
SAIL AIM-236, STAN-CS-74-433, Stanford University, May, 1974.
.pt24



[9]\#Burstall, R. M., and J. Darlington, "Some Transformations for Developing
Recursive Programs", %3Proceedings of the International Conference
on Reliable Software%1, Los Angeles, Ca., 1975.
.pt24


[10]\#Burstall, R. M., and J. Darlington, "A Transformation System for Developing
Recursive Programs", D.A.I. Research Report No.19,
University of Edinburgh, March, 1976.
.pt24



[11]\#Clark, K., "Synthesis and Verification of Logic Programs", Research report,
CCD, Imperial College, 1977.
.pt24


[12]\#Clark, K., and J. Darlington, "Algorithm Classification Through Synthesis",
Imperial College of Science and Technology, London, June, 1978,
to appear in %3Computer Journal%1.

.pt24



[13]\#Clark, K., and S. Sickel, "Predicate Logic: A Calculus for Deriving
Programs", %3Proceedings of the Fifth International Joint Conference
on Artificial Intelligence%1, Cambridge, Mass., Aug., 1977.
.pt24


[14]\#Darlington, J., "Applications of Program Transformation to Program 
Synthesis", %3IRIA, Proceedings of the Symposium on Proving
and Improving Programs%1, Rocquencourt, France, 1975.
.pt24
[15]\#Darlington, J., "Program Transformation and Synthesis: Present Capabilities",
DAI Report No. 48, University of Edinburgh, Research Report No. 77/43,
Imperial College of Science and Technology, Dept. of Computing and
Control, Sept., 1977.
.pt24


[16]\#Darlington, J., and R. M. Burstall, "A System Which Automatically Improves
Programs", %3Acta Informatica%1, vol. 6, pp.41-60, 1976.
.pt24

[17]\#Davis, R. E., "Deduction, Truth, and Computation", Master's Thesis,
San Jose State University, San Jose, Ca., 1976.

.pt24
[18]\#Davis, R. E., "Generating Correct Programs from Logic Specifications",
####, Ph.D. Thesis, University of California, Santa Cruz, 1979.

.pt24

[19]\#Dershowitz, N., and Z. Manna, "On Automated Structured Programming",
%3IRIA, Proceedings of the Symposium on Proving
and Improving Programs%1, Rocquencourt, France, 1975.
.pt24
.fp
[20]\#van Emden, M. H., and R. A. Kowalski, "The Semantics of Predicate Logic
as a Programming Language", Memorandum MIP-R-103, School of
Artificial Intelligence, University of Edinburgh, Feb., 1974.

.pt24
[21]\#Franusich, M., "Analysis of Logic Programs for Static and Dynamic Subgoal
Selection", Senior Thesis,
University of California, Santa Cruz, June, 1979.
.pt24

[22]\#Gerhart, S. L., "Correctness-Preserving Program Transformations", %3Proceedings
of the 2nd ACM Symposium on Principles of Programming Languages%1,
pp.54-66, January, 1975.
.pt24


[23]\#Goldman, N. M., R. M. Balzer, and D. S. Wile, "The Inference of
Domain Structure from Informal Process Descriptions", %3Proceedings
of the Workshop on Pattern-Directed Inference Systems%1,
Honolulu, May, 1978.
.pt24
[24]\#Gordon, M., R. Milner, and C. Wadsworth, "Edinburgh LCF", Internal Report
CSR-11-77 Part 1, Dept. of Computer Science, University of 
Edinburgh, Sept 1977.
.pt24

[25]\#Green, C. C., "A Summary of the PSI Program Synthesis System",
%3Proceedings of the Fifth International Joint Conference on
Artificial Intelligence%1, Cambridge, Mass., August 1977.      

.pt24


[26]\#Hardy, S., "Synthesis of LISP Functions from Examples", %3Proceedings of
the Fourth International Joint Conference on Artificial
Intelligence%1, Tbilisi, Georgia, U.S.S.R., Sept., 1975.
.pt24


[27]\#Heidorn, G. E., "Automatic Programming Through Natural Language
Dialogue: A Survey", %3IBM Journal of Research and Development%1,
vol. 20, no. 4, July, 1976.
.pt24


[28]\#von Henke, F., "On Generating Programs from Data Types: An Approach
to Automatic Programming",
%3IRIA, Proceedings of the Symposium on Proving
and Improving Programs%1, Rocquencourt, France, 1975.
.pt24

[29]\#Igarashi, S., R. L. London, and D. C. Luckham, "Automatic Program
Verification I: A Logical Basis and its Implementation",
%3Acta Informatica%1, vol. 4, pp. 145-182, 1975.


.pt24


[30]\#Kowalski, R., "Predicate Logic as Programming Language", %3Proceedings
IFIP 74%1, North-Holland Publishing Company, 1974.

.pt24


[31]\#Lee, R. C. T., and S. K. Chang, "Structured Programming and
Automatic Program Synthesis", %3Proceedings of a Symposium on
Very High Level Languages%1, Santa Monica, Ca., 1974.
.pt24


[32]\#Lenat, D. B., "Synthesis of Large Programs from Specific Dialogues",
%3IRIA, Proceedings of the Symposium on Proving
and Improving Programs%1, Rocquencourt, France, 1975.
.pt24

[33]\#Loveman, D. B., "Program Improvement by Source-to-Source Transformation",
%3Journal of the ACM %224:1%1, pp.121-145, 1977.
.pt24


[34]\#Manna, Z. and R. Waldinger, "The Automatic Synthesis of Recursive Programs",
%3Proceedings of the Symposium on AI and Programming Languages%1,
SIGPLAN/SIGART NOTICES/NEWSLETTER Vol. 12, No. 8, August, 1977/
No. 64, August, 1977.
.pt24

[35]\#Manna, Z. and R. Waldinger, "Synthesis: Dreams => Programs", AIM-302,
STAN-CS-77-630, Stanford University, November, 1977.
.pt24

[35]\#McCarthy, J., and C. Talcott, %3LISP Programming and Proving%1.
To be published.



.pt24
[37]\#Sickel, S., "Invertibility of Logic Programs", Technical Report No. 78-8-005,
University of California, Santa Cruz, August, 1978.

.pt24

 
[38]\#Slagle, J. R., "Experiments with a Deductive Question-Answering Program",
%3Communications of the Association for Computing Machinery%1, Vol. 8,
No. 12, Dec., 1965.
.pt24


 
[39]\#Summers, P. D., "A Methodology for LISP Program Construction from Examples",
%3Journal of Association for Computing Machinery%1, vol. 24, no. 1,
Jan., 1977.
.pt24

 

 
[40]\#Wile, D. S., R. M. Balzer, and N. M. Goldman, "Automated Derivation
of Program Control Structure from Natural Language Program
Descriptions", %3Proceedings of the Symposium on Artificial 
Intelligence and Programming Languages%1, University of Rochester,
N.Y., Aug., 1977.
.pt24

 
[41]\#Ulrich, J. W., and R. Moll, "Program Synthesis by Analogy",
%3Proceedings of the Symposium on Artificial 
Intelligence and Programming Languages%1, University of Rochester,
N.Y., Aug., 1977.
.pt24

.end "bib"